ABSTRACT



We apply Principal Component Analysis (PCA) to study 
the variability of the X-ray continuum in the Seyfert 1 
galaxy NGC 7469. The PCA technique is used to sep- 
arate out linear components contributing to variability 
between multiple datasets; the technique is often used 
in analysis of optical spectra, but has rarely been ap- 
plied to AGN X-ray spectroscopy. Running a PCA al- 
gorithm on 0.3-10 keV EPIC data from a 150 ks XMM- 
Newton observation of NGC 7469, we describe the spec- 
tral components extracted and evaluate the usefulness of 
the PCA technique for understanding the X-ray contin- 
uum in AGN. 
1. INTRODUCTION AND METHODOLOGY



Principal Component Analysis (PCA) is a technique 
which finds linear combinations of the spectral compo- 
nents that produce most of the variability in a time se- 
ries of spectra. Using only the first few of these compo- 
nents (the 'principal ones') one can describe the time se- 
ries with a greatly reduced number of parameters. These 
components may correspond to physical parameters of 
the system that are changing, so the method may be used 
to discover how many parameters there are, and how they 
change with time. 

In order to calculate the components, first the mean of
each variable is subtracted. The mean corresponds to the
zeroth order component. Next, each variable is scaled by
its standard deviation. Generally, the correlation matrix
of these shifted, scaled variables is then obtained. The
mathematics is simplified, however, by using the method
of singular value decomposition to find the eigenvalues
and eigenvectors. With this technique, the correlation
matrix is not explicitly required, and the raw data matrix
is used instead (Mittaz et al., 1990). The output of the
PCA consists of a set of principal components, a matrix
of coefficients and a set of eigenvalues. Each input spec- 
trum can be reconstructed from the components once they 
have been multiplied by the appropriate set of coefficients 
which determine how much of an effect each component 
has in a given spectrum. 
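
The steps described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the code used for this work: the function names pca_svd and reconstruct, and the layout of the input array (one row per spectrum, one column per energy channel), are hypothetical choices made for the example.

```python
import numpy as np


def pca_svd(spectra):
    """PCA of a time series of spectra via singular value decomposition.

    spectra: array of shape (n_spectra, n_channels), one row per spectrum.
    (Hypothetical layout, chosen for this sketch.)
    """
    n_spectra = spectra.shape[0]
    # Subtract the mean of each variable (the zeroth order component).
    mean = spectra.mean(axis=0)
    # Scale each variable by its standard deviation.
    std = spectra.std(axis=0, ddof=1)
    std = np.where(std > 0, std, 1.0)  # guard against constant channels
    z = (spectra - mean) / std
    # SVD of the shifted, scaled data matrix; the correlation matrix
    # is never formed explicitly.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    eigenvalues = s**2 / (n_spectra - 1)  # eigenvalues of the correlation matrix
    coeffs = u * s       # (n_spectra, n_components): weight of each component
    components = vt      # (n_components, n_channels): principal components
    return mean, std, components, coeffs, eigenvalues


def reconstruct(mean, std, components, coeffs, n_keep=None):
    """Rebuild the input spectra from the first n_keep components."""
    z = coeffs[:, :n_keep] @ components[:n_keep]
    return mean + std * z
```

With all components kept, reconstruct returns the input spectra to within rounding error. The fraction of the variability carried by each component is eigenvalues / eigenvalues.sum().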

A problem with Principal Component Analysis is that 
it strictly only works when the signal is a superposition 
of linear components. If there is a non-linear interac- 
tion between components then it will 'incorrectly' sep- 
arate them. For example, if the time series consists of the 
spectrum of a Gaussian line profile which changes width 
over time, then the components will be similar to Fourier 
modes. These modes can be combined to produce a line 
of any width, but do not individually correspond to phys- 
ical reality. If there is a combination of non-linear effects 
operating simultaneously within the system, then inter- 
pretation can be difficult. Statistical noise in the data is 
another potential source of confusion. This introduces 
extra principal components to describe the noise fluctua- 
tions, which obviously need to be removed as they do not 
describe physical processes of interest. This can be done 
by truncating the list of components in the correct place, 
by assuming that the weaker components (those with 
smaller eigenvalues) correspond to the Fourier modes for 
the broad-band noise. 
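
The Gaussian line example can be illustrated with a short simulation. The sketch below is an assumption-laden toy, not part of the analysis: the energy grid, the line energy of 6.4 keV, the range of widths and normalisations, and the helper names are all hypothetical. The scaling by standard deviation is also omitted here, because the line wings have almost no variance and would be dominated by rounding error.

```python
import numpy as np


def variance_fractions(spectra):
    """Fraction of the variance carried by each principal component."""
    centred = spectra - spectra.mean(axis=0)
    s = np.linalg.svd(centred, compute_uv=False)
    return s**2 / np.sum(s**2)


def gaussian(energy, centre, sigma, norm):
    """Gaussian line profile with peak height norm."""
    return norm * np.exp(-0.5 * ((energy - centre) / sigma) ** 2)


energy = np.linspace(5.0, 8.0, 300)  # keV, illustrative grid

# Linear case: a fixed profile whose normalisation changes with time.
norms = np.linspace(0.5, 1.5, 12)
linear = np.array([gaussian(energy, 6.4, 0.2, n) for n in norms])

# Non-linear case: fixed normalisation, width changes with time.
sigmas = np.linspace(0.1, 0.5, 12)
nonlinear = np.array([gaussian(energy, 6.4, s, 1.0) for s in sigmas])

frac_linear = variance_fractions(linear)
frac_nonlinear = variance_fractions(nonlinear)
```

In the linear case a single component carries essentially all of the variance. In the varying-width case the variance is spread over several components, none of which is itself a physical line profile.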

NGC 7469 is a bright nearby Seyfert (z = 0.0164) which
was observed for 150 ks by XMM-Newton in November/December
2004. The total exposure time was split
into two parts over consecutive orbits. This long observation
was obtained primarily for the purposes of high
resolution spectroscopy with the RGS; this spectrum contains
evidence of absorption by outflowing material with
at least two different levels of ionisation. In order to correctly
model the soft X-ray spectral features, we need
to understand the continuum underlying them. Fig. 1
shows the EPIC-pn spectra from the whole of the observation
and from both parts separately. The solid line is
a power-law model with Galactic absorption fitted to the
3-5.5 keV and 7-10 keV ranges. The lower panel of
this plot shows the ratios of the three observed spectra to
the power-law fit. The soft excess is clearly variable, so
we applied PCA to see if it could separate out different
varying components in the EPIC-pn spectrum (see e.g.
Vaughan & Fabian, 2004). We ran a PCA code (based
on that of Francis & Wills, 1999) on a series of twelve
~10 ks EPIC-pn spectra (six from each part of the observation).
This exposure time was chosen as a compromise
between obtaining good time resolution to look for
changing components in the spectrum and having sufficient
signal-to-noise in each spectrum.

Figure 1. EPIC-pn spectrum of NGC 7469 from the first
and second part of the observation, and from the whole
observation combined. Also the ratios of these spectra
to a Galactic-absorbed power-law model fitted to the
3-5.5 keV and 7-10 keV ranges.



2. RESULTS AND DISCUSSION 



The PCA code generates as many principal components 
as there are input datasets, in this case twelve. The eigen- 
values for each component indicate how much of the 
spectral variability that particular component is respon- 
sible for. In this case, the first two principal components 
seem to be significant, being responsible for ~18% and 
~13% of the variability respectively. The other ten components
appear to be noise. Figs. 2 and 3 show components
1 and 2 added to the mean spectrum (for clarity,
only the components plus mean showing the greatest extent
of the variability are plotted). These plots indicate the
presence of a variable hard power-law and a variable soft
power-law, with the hard power-law being more variable
than the soft. The iron Kα line seems only to be present
in the constant mean spectrum but not in the individual
components, thus implying that it is not variable on these
timescales, or that any variability is below the level of
the statistical noise in the input spectra.
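
One way to build the 'component plus mean' spectra described above is sketched below. This is a hedged illustration and not the code used here: the function name extreme_spectra and the input layout (one row per spectrum) are hypothetical, and it assumes the same mean subtraction and scaling by standard deviation as outlined in Section 1.

```python
import numpy as np


def extreme_spectra(spectra, k):
    """Minimum and maximum contribution of component k, added to the mean.

    spectra: array of shape (n_spectra, n_channels); k: component index,
    with 0 the most variable. Returns (low, high, fraction), where fraction
    is the share of the total variability carried by component k.
    """
    mean = spectra.mean(axis=0)
    std = spectra.std(axis=0, ddof=1)
    std = np.where(std > 0, std, 1.0)  # guard against constant channels
    z = (spectra - mean) / std
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    coeffs = u[:, k] * s[k]            # weight of component k in each spectrum
    fraction = s[k] ** 2 / np.sum(s**2)
    # Undo the scaling so the result is in the units of the input spectra.
    high = mean + std * coeffs.max() * vt[k]
    low = mean + std * coeffs.min() * vt[k]
    return low, high, fraction
```

The overall sign of a principal component is arbitrary, so which of the two returned spectra corresponds to the brighter state has to be checked against the data.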

Since PCA will reduce a non-linear variability to a series
of Fourier-type components, are these two components
genuinely physical or simply Fourier deconstructions
of a much more complicated reality? A good sign
may be that there are only two apparently non-noise components;
if the spectrum had been varying in a more complicated
way, a larger number of power-law type components
might have been generated than the two plausible
ones we see here. On the subject of whether PCA tells
us anything about the spectrum of NGC 7469 that we did
not already know via less sophisticated methods (there is
a soft and a hard component, varying in different ways),
we conclude that it does not provide us with new information;
the method does however give a model-independent
confirmation of current knowledge.

Figure 2. Maximum and minimum spectra (added to the
mean) for the most variable component.

Figure 3. Maximum and minimum spectra (added to the
mean) for the second most variable component.